The eukaryotic protein kinases comprise one of the largest superfamilies
of homologous proteins and genes. Within this family, there are now
hundreds of different members whose sequences are known. Although there
is a rich diversity of structures, regulation modes, and substrate
specificities among the protein kinases, there are also common structural
features. These conserved structural motifs provide clear indications as
to how these enzymes manage to transfer the γ-phosphate of a purine
nucleotide triphosphate to the hydroxyl groups of their protein
substrates. The authors of this review have carried out a monumental task
of analyzing and collating the amino acid sequences of all reported
protein kinases and defining the conserved structural features that
characterize the portion of these proteins that is responsible for their
catalytic activity. Comparison of the sequences in the catalytic fragment
of the protein kinases has been used to arrange these enzymes in
evolutionary trees that group subfamilies of closely related enzymes. It
is comforting that the structural relationships that emerge from these
trees result in groupings that also reflect related functions. The work
presented in this review seems to be an excellent example of the type of
analysis that will become indispensable in the coming years, as more and
more sequence information becomes available to biologists as a result of
the genome projects.



The eukaryotic protein kinases make up a large superfamily of homologous
proteins. They are related by virtue of their kinase domains (also known
as catalytic domains), which consist of ~250-300 amino acid residues. The
kinase domains that define this group of enzymes contain 12 conserved
subdomains that fold into a common catalytic core structure, as revealed
by the 3-dimensional structures of several protein-serine kinases. There
are two main subdivisions within the superfamily: the
protein-serine/threonine kinases and the protein-tyrosine kinases. A
classification scheme can be founded on a kinase domain phylogeny, which
reveals families of enzymes that have related substrate specificities and
modes of regulation.
Hanks, S. K., Hunter, T. The eukaryotic protein kinase superfamily:
kinase (catalytic) domain structure and classification. FASEB J. 9,
576-596 (1995)
One of the largest known protein superfamilies is made up of protein
kinases identified largely from eukaryotic sources. (The term superfamily
will be used here to distinguish this broad collection of enzymes from
smaller, more closely related subsets that have been commonly referred to
as families.) These enzymes use the γ-phosphate of ATP (or GTP) to
generate phosphate monoesters using protein alcohol groups (on Ser and
Thr) and/or protein phenolic groups (on Tyr) as phosphate acceptors. The
protein kinases are related by virtue of their homologous kinase domains
(also known as catalytic domains), which consist of ~250-300 amino acid
residues (reviewed in refs 1-3; and see below). During the past 15 years,
previously unrecognized members of the eukaryotic protein kinase
superfamily have been uncovered at an exponentially increasing rate and
currently appear in the literature almost weekly. This pace of discovery
can be attributed to the past development of molecular cloning and
sequencing technologies and, more recently, to the advent of the
polymerase chain reaction (PCR), which facilitated the use of
homology-based cloning strategies. Consequently, about 200 different
superfamily members (products of distinct paralogous genes) have been
recognized from mammalian sources alone. The prediction made several
years ago (4) that the mammalian genome contains about 1000 protein
kinase genes (roughly 1% of all genes) would still appear to be within
reason, and may even be an underestimate (5).
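The figures in this paragraph can be related to one another with a little arithmetic. The sketch below is illustrative only: it takes the numbers quoted in the text (about 1000 predicted kinase genes, roughly 1% of all genes, about 200 members recognized from mammals) and derives the total gene count that the 1% figure implies, along with the share of the predicted kinases already recognized. The function and constant names are hypothetical and are not part of the review.

```python
# Figures quoted in the text; all are rough estimates, not measurements.
PREDICTED_KINASE_GENES = 1000        # predicted mammalian protein kinase genes (ref 4)
KINASE_FRACTION_OF_GENES = 0.01      # "roughly 1% of all genes"
RECOGNIZED_MAMMALIAN_KINASES = 200   # "about 200" distinct members recognized


def implied_total_genes(kinase_genes: int, fraction: float) -> int:
    """Total gene count implied if `kinase_genes` is `fraction` of all genes."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return round(kinase_genes / fraction)


def share_identified(recognized: int, predicted: int) -> float:
    """Fraction of the predicted kinase genes that have been recognized."""
    if predicted <= 0:
        raise ValueError("predicted must be positive")
    return recognized / predicted


if __name__ == "__main__":
    total = implied_total_genes(PREDICTED_KINASE_GENES, KINASE_FRACTION_OF_GENES)
    share = share_identified(RECOGNIZED_MAMMALIAN_KINASES, PREDICTED_KINASE_GENES)
    print(f"Implied total gene count: {total}")
    print(f"Share of predicted kinases recognized: {share:.0%}")
```

Under these assumptions, the 1% figure corresponds to a genome of about 100,000 genes, and the roughly 200 recognized mammalian kinases would be about one-fifth of the predicted total.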

In addition to mammals and other vertebrates, eukaryotic protein kinase
superfamily members have been identified and characterized from a wide
range of other animal phyla as well as from plants, fungi, and
protozoans. Hence, the protein kinase progenitor gene can be traced back
to a time before the evolutionary separation of the major eukaryotic
kingdoms. The identification of eukaryotic-like protein kinase genes in
prokaryotes (6, 7) raises the possibility that the protein kinase
progenitor gene might have arisen before the divergence of prokaryotes
and eukaryotes (see below). Studies of the budding and fission yeasts,
Saccharomyces cerevisiae and Schizosaccharomyces pombe, have been
particularly fruitful in the recognition of new protein kinases. In these
genetically tractable organisms, the powerful approach of mutant
isolation and cloning by complementation has netted dozens of protein
kinase genes required for numerous aspects of cell function (8). In many
cases, vertebrate counterparts have now been found for these genes,
leading to a growing awareness that protein phosphorylation pathways that
regulate basic aspects of cell physiology have been maintained throughout
the course of eukaryotic evolution.

...chapter in The Protein Kinase Factsbook, edited by D. G. Hardie and
S. K. Hanks, published in 1995 by Academic Press, London.

*To whom correspondence and reprint requests should be addressed, at:
Molecular Biology and Virology Laboratory, The Salk Institute, 10010 N.
Torrey Pines Rd., La Jolla, CA 92037, USA

Abbreviations: PCR, polymerase chain reaction; PKA-Cα, type α
cAMP-dependent protein kinase catalytic subunit; Cdk2, cyclin-dependent
kinase 2; Erk2, p42 MAP kinase; APE,

0892-6638/95/0009-0576/$01.50. © FASEB

SERIAL REVIEW

Even though the overwhelming majority of protein kinases identified from
eukaryotic sources belong to this superfamily, a small but growing number
of such enzymes do not qualify as superfamily members. Most of these are
related to the prokaryotic protein-histidine kinase family (see below),
which forms the sensor components of two-component signal transduction
systems (9). Included in this category are a putative ethylene receptor
encoded by the flowering plant ETR1 gene (10), the product of the budding
yeast SLN1 gene (11, 12) thought to be involved in relaying nutrient
information to elements controlling cell growth and division, the
mitochondrial branched-chain α-ketoacid dehydrogenase kinase (13), and
the mitochondrial pyruvate dehydrogenase kinase (14). In prokaryotes,
protein-histidine kinases phosphorylate aspartates in their target
proteins, but except for the two dehydrogenase kinases that phosphorylate
serine, the acceptor specificities of most of the eukaryotic protein
kinases of this type are not known. In addition to these protein kinases,
the Bcr protein encoded by the breakpoint cluster region gene involved in
the Philadelphia chromosome translocation (15) and the A6 kinase isolated
by expression cloning using an anti-phosphotyrosine antibody (16) have
kinase domains unrelated to any known eukaryotic or prokaryotic kinase.
In addition, true protein-histidine kinases are known in eukaryotes. One
such enzyme has been extensively characterized from budding yeast but not
yet molecularly cloned (17), and so it is not clear whether this enzyme
will belong to the protein kinase superfamily or use a novel structural
principle for phosphotransfer.

What about the prokaryotes? It has been known for years that protein
phosphorylation events play key regulatory roles in numerous bacterial
cell processes including chemotaxis, bacteriophage infection, nutrient
uptake, and gene transcription (reviewed in refs 18, 19). The bacterial
protein kinases have been divided into three general classes (20): 1)
protein-histidine kinases such as those functioning in two-component
sensory regulatory systems (strictly speaking, these are protein-aspartyl
kinases, because autophosphorylation on His is an intermediary step in
phosphotransfer to an aspartate in the response-regulator protein) (9);
2) phosphotransferases such as those of the phosphoenolpyruvate-dependent
phosphotransferase system involved in sugar uptake (21); and 3)
protein-serine kinases such as isocitrate dehydrogenase
kinase/phosphatase (22). Amino acid sequences have been determined for
members of each class, and all are unrelated to the eukaryotic protein
kinase superfamily.

Recently, however, true homologs of the eukaryotic protein kinases have
been identified from two species of bacteria, Yersinia pseudotuberculosis
(7) and Myxococcus xanthus (6, 23). Are these special cases, or the first
examples of many such genes in prokaryotes? The eukaryotic-like protein
kinase YpkA from the pathogenic enterobacterium Y. pseudotuberculosis is
encoded by a plasmid essential for the virulence of this infectious
organism. In addition to YpkA, at least two other proteins encoded by
genes residing on the virulence plasmid exhibit high similarity to
eukaryotic proteins. Thus, it seems likely that the virulence plasmid
genes were transduced from a eukaryotic host by horizontal transfer. The
myxobacterium M. xanthus presents a different and perhaps more intriguing
picture. Application of the PCR homology-based cloning strategy revealed
that at least eight genes encoding members of the eukaryotic protein
kinase superfamily are present in the genome of this species (23). The
myxobacteria are unusual prokaryotes in that they undergo a complex
developmental cycle upon nutrient depletion, much like that of the
eukaryotic slime mold Dictyostelium. Given that protein kinases are
commonly involved in regulating growth and differentiation of eukaryotic
cells, it is attractive to speculate that the eukaryotic-like protein
kinases in M. xanthus are specifically involved in regulating their
developmental cycle. Indeed, one of these kinases, Pkn1, was shown to be
required for proper fruiting body formation. The same could be true for
the eukaryotic-like protein kinase PknA from Anabaena (24). In keeping
with this idea, neither the PCR approach applied to Escherichia coli (23)
nor extensive sequencing of the E. coli genome (now 30% complete) has
yielded eukaryotic-like protein kinases. Hence, genes encoding members of
the eukaryotic protein kinase superfamily may be present only in bacteria
that can undergo a developmental cycle. However, unpublished reports of
eukaryotic-like protein kinases in Streptomyces coelicolor, and in three
species of Methanococcus, suggest that such genes are more widely
expressed among prokaryotes, and potentially these genes represent the
ancestors for the entire eukaryotic protein kinase superfamily.



THE HOMOLOGOUS KINASE DOMAINS

The kinase domains of eukaryotic protein kinases impart the catalytic
activity. Three separate roles can be ascribed to the kinase domains: 1)
binding and orientation of the ATP (or GTP) phosphate donor as a complex
with divalent cation (usually Mg2+ or Mn2+); 2) binding and orientation
of the protein (or peptide) substrate; and 3) transfer of the γ-phosphate
from ATP (or GTP) to the acceptor hydroxyl residue (Ser, Thr, or Tyr) of
the protein substrate.



Conserved features of primary structure



The total number of distinct kinase domain amino acid sequences available
is now approaching 400 (Table 1). Included in this total are the
vertebrate enzymes encoded by distinct paralogous genes, their presumed
functional homologs from invertebrates and simpler organisms (encoded by
orthologous genes), and those identified from lower organisms and plants
for which vertebrate equivalents have not been found. Conserved features
of kinase domain primary structure have previously been identified
through an inspection of multiple amino acid sequence alignments (1-3).
The large number of sequences now available precludes showing an
alignment containing all known kinase domains. Thus, in Fig. 1 only 60
different kinase domain sequences are aligned. These are drawn, however,
from the widest possible sampling of the superfamily and thus provide a
good representation of the
AGC Group

AGC-I. Cyclic nucleotide-regulated protein kinase family
A. Cyclic AMP-dependent protein kinase (PKA) subfamily 



1. PKA-Cα:

2. PKA-Cβ:
3. PKA-Cγ:

1. DmPKA-C0:

2. DmPKA-C1:

3. DmPKA-C2:



1. CePKA: 
Saccharomyces cerevisiae:
1. ScPKA-Tpk1:
Schizosaccharomyces pombe:
1. SpPKA1:



1.1 



PKA catalytic subunit, alpha-form 
PKA catalytic subunit, beta-form 
PKA catalytic subunit, gamma-form 

PKA catalytic subunit, C0 form
PKA catalytic subunit, C1 form
PKA catalytic subunit, C2 form 



Aplysia californica:

1. AplC: 

2. Sak: 

B. Cyclic GMP-dependent protein kinase (PKG) subfamily 



PKA catalytic subunit homolog, type 1 
PKA catalytic subunit homolog 
PKA catalytic subunit 
PKA catalytic subunit homolog 



1. DmPKG-G1:

2. DmPKG-G2:



AGC-II. Diacylglycerol-activated/phospholipid-dependent protein kinase C (PKC) family
A. "Conventional" (Ca2+-dependent) protein kinase C (cPKC) subfamily



1. cPKCα:

2. cPKCβ:

3. cPKCγ:



1. DmPKC-53Ebr: 

2. DmPKC-5SEcy: 
Aplysia californica:

l.ApM: 

B. "Novel" (Ca2+-independent) Pi
vertebrate: 

1. nPKCδ:

2. nPKCε:
3. nPKCη:
4. nPKCθ:

Drosophila melanogaster:
1. DmPKC-98F:



1. ApMI: 
Caenorhabditis elegans: 
1. CePKC: 

• 2. CePKClB: 
Dictyostelium discoideum:

* 1. DdMHCK:
Saccharomyces cerevisiae:

1. ScPKAl: 



Protein Kinase C, alpha-form 
Protein Kinase C, beta-form 
Protein Kinase C, gamma-form



PKC homolog, locus 98F 
PKC homolog, type n 



C. "Atypical" Protein Kinase C (aPKC) subfamily 



"Pombe Ckinase", type 2 



1. aPKCζ:

2. aPKCι:
4. aPKCμ:



Protein Kinase C, zeta-form 
Protein Kinase C, iota-form 
Protein Kinase C, mu-form 



•More information about the individual protein kinases can be obtained by contacting the authors or by
consulting The Protein Kinase Factsbook (42). Protei[...]ts (•) were not included in the phylogenetic
analysis due to their [...] discovery. In many instances, [...] were cloned by more than one group; in these
cases the most co[...] the entry and alternative [...]
Table 1. (continued).



Protein kinase with PKC-related catalytic di 
AGC-III. Related to PKA and PKC (RAC) family
Kate: 

1. RAC-α:

2. RAC-β:



1. CeRAC: RAC homolog
AGC-IV. Family of kinases that phosphorylate G protein-coupled receptors
hmU: 

1. βARK1: β-adrenergic receptor kinase, type 1

2. βARK2: β-adrenergic receptor kinase, type 2

3. RhK: Rhodopsin--
4. IT11:

5. GRK5:

6. GRK6:


1. DmGPRK1: Drosophila G-protein-coupled receptor kinase, type 1

2. DmGPRK2: Drosophila G-protein-coupled receptor kinase, type 2 
AGC-V. Family of budding yeast AGC-related kinases 



Suppressor of defects in cAMP effector pathway 
AGC-related kinase 



2. Ykr2:

3. Ypk1: AGC-related
AGC-VI. Family of kinases that phosphorylate ribosomal S6 protein

1.S6K: 70 kDa S6 kinase with single catalytic domain 

2. RSK1(Nt): 90 kDa S6 kinase, type 1

3. RSK2(Nt): 90 kDa S6 kinase, type 2

[Note: The RSK enzymes have two distinct catalytic domains. The Nt-domain is closely related to S6K, while the
Ct-domain is most closely related to phosphorylase kinase]
AGC-VII. Budding yeast Dbf2/20 Family
Saccharomyces cerevisiae: 

1. Dbf2: Product of gene periodically expressed in cell cycle 

2. Dbf20: Close relative of DBF2 not under cell cycle control 

AGC-VIII. Flowering plant "PVPK1 Family" of protein kinase homologs
Phylum Angiospermophyta (Kingdom Plantae): 
1. PvKl: 



2. OsCl 1 A: Rice protein kinase homolog 

3. ZmPPK: Maize protein kinase homolog

4. AtPK5: Arabidopsis protein kinase homolog 

5. AtPK7: Arabidopsis protein kinase homolog 

6. AtPK64: Arabidopsis protein kinase homolog

7. PsPK5: Pea protein kinase homolog 



Other AGC-related kinases




"Myotonic Dystrophy Protein Kinase"
"Serum and glucocorticoid regulated kinase"
Spermatid "Microtubule-associated serine/threonine kinase"

Product of gene required for normal colonial growth 

Product of developmentally-regulated gene 



Phylum Angiospermophyta (Kingdom Plantae): 
* 1. Atpk1: Arabidopsis protein kinase

CaMK-I. Family of kinases regulated by Ca2+/Calmodulin, and close relatives
A. Subfamily including "Multifunctional" Ca2+/Calmodulin Kinases (CaMKs)

CaMK, type I 

CaMK, type II, alpha subunit 
3. CaMK2β: CaMK, type II, beta subunit

4. CaMK2γ: CaMK, type II, gamma subunit

5. CaMK2δ: CaMK, type II, delta subunit

" " " " ption Factor-2 Kinase or CaMK type II 



1. ScCaMK2-1: CaMK-II homolog, product of CMK1 gene

2. ScCaMK2-2: CaMK-II homolog, product of CMK2 gene
Aspergillus nidulans:

1. AnCaMK2: CaMK-II homolog

B. Subfamily including phosphorylase kinases 



1. PhK-γM: Skeletal muscle phosphorylase kinase catalytic subunit

2. PhK-γT: Male germ cell phosphorylase kinase catalytic subunit

3. RSK1(Ct): 90 kDa S6 kinase, type 1; C-terminal catalytic domain

4. RSK2(Ct): 90 kDa S6 kinase, type 2; C-terminal catalytic domain
C. Subfamily including myosin light chain kinases 

'mite: 

1 . skMLCK: Skeletal muscle MLCK (rabbit) 

2. smMLCK: Smooth muscle MLCK (rabbit) 

3. Titin: Huge protein implicated in skeletal muscle development 
Caenorhabditis elegans:

1 . Twn: "Twitchin" protein involved in muscle contraction or development 



1. DdMLCK: Slime mold myosin light chain kinase 

D. Subfamily of plant kinases with intrinsic calmodulin-like domain
Phylum Angiospermophyta (Kingdom Plantae): 

1. CDPK: Soybean Ca2+-regulated kinase with intrinsic CaM-like domain

2. AtAKl: Arabidopsis CDPK homolog 

• 3. OsSpk: Rice CDPK homolog 

• 4. DcPk431 : Carrot CDPK homolog 

E, Subfamily of plant kinases with highly acidic domain 
Phylum Angiospermophyta (Kingdom Plantae): 

• 1. ASK1: Arabidopsis protein kinase homolog with highly acidic domain
Arabidopsis protein kinase homolog with highly acidic domain 

d kinases 

1. PskHl: Putative protein-serine kinase 

2. MAPKAP2: "MAP Kinase-Activated Protein Kinase 2" 



Saccharomyces cerevisiae:

1. Mre4:
* 2. Dun1:



3. Rck1: "Radiation sensitivity complementing kinase, type 1"

4. Rck2: "Radiation sensitivity complementing kinase, type 2"



CaMK-II. Snfl/AMPK family 



1: AMPK: 
2: p78: 

1. Snf1: Kinase essential for release from glucose repression

2. Kin1: Protein kinase with N-terminal catalytic domain

3. Kin2: Close relative of KIN1

4. Ycl24: Protein kinase homolog on chromosome III

* 5. Ycl453: Protein kinase homolog on chromosome XI
Schizosaccharomyces pombe:

1. SpKin1:
2. Nim1:

Phylum Angiospermophyta (Kingdom Plantae): 

1. PSnf1-RKIN1: Rye putative protein kinase that complements yeast snf1 pi

2. PSnf1-AKIN10: Arabidopsis putative protein kinase related to SNF1

3. PSnf1-BKIN12: Barley protein related to SNF1

* 4. PKABA1: Wheat kinase induced by abscisic acid

* 5. WPK4: Wheat kinase homolog regulated by light and nutrients

* 6. NPK5: Tobacco Snf1 homolog, activates SUC2 gene expression



CMGC Group

CMGC-I. Family of cyclin-dependent kinases (CDKs) and other close relatives 

Inducer of mitosis; functional homolog of yeast cdc2+/CDC28 kinases (Cdkl) 
Type 2 cyclin-dependent kinase 
Type 3 cyclin-dependent kinase 
Type 4 cyclin-dependent kinase 
Type 5 cyclin-dependent kinase
6. Cdk6:

7. PCTAIRE1:

8. PCTAIRE2:

9. PCTAIRE3: 

10. Mo15:



Cdc2-related protein 
Cdc2-related protein 



"Cdk-activating kinase"; Negative regulator of meiosis (CAK) 

Functional homolog of yeast cdc2+/CDC28 kinases 
Cdc2-cognate protein; Cdk2 homolog 

of yeast cdc2+/CDC28 kinases




Phylum Angiospermophyta (Kingdom Plantae):



al homolog of y 
"Cdc2-relatedPCTAIRE 

Cdc2-related gene product

Cdc2-related protein from human malarial parasite 

Cdc2-related protein 

Cdc2-related protein

"Cdc2-Related Kinase"

"Cell-division-cycle" gene product

Negative regulator of the PHO system and cell cycle regulator
CDC28-related protein

"Cell-division-cycle" gene product

Cdc2 homolog from dimorphic fungus 



CMGC-II. Erk(MAP kinase) family 

■ IK 

1. Erk1:



Flowering plant Cdc2 homolog that complements yeast mutai
Alfalfa Cdc2 cognate gene products that complement G1/S tr
More distantly related Cdc2 homolog from rice



"Extracellular signal-regulated kinase", type 1 (p44 MAP kinase)
"Extracellular signal-regulated kinase", type 2 (p42 MAP kinase) 
Somewhat distant relative of the Erk/MAP kinases 
Another more distant relative of the Erk/MAP kinases 
"Stress-activated protein kinase, type alpha" (JNK2) 
"Stress-activated protein kinase, type beta" 

"Stress-activated protein kinase, type gamma" or "Jun N-terminal Kinase" 
HOG1-related protein (MPK2)

Homolog of Erk/MAP kinases; product of rolled gene 

Erk/MAP kinase homolog 



2. Fus3:

3. Slt2: Product of gene complementing lyt2 mutants (MPK1)

* 4. Hog1: Product of gene required for osmoregulation

1. Spk1: Product of gene that confers drug resistance to staurosporine, a PK in

Phylum Deuteromycota (Kingdom Fungi):

1. CaErk1: Protein that interferes with mating factor-induced cell cycle arrest

Trypanosoma brucei (Phylum Zoomastigina, Kingdom Protoctista):

* 1. KFR1: "KSS1- and FUS3-related"
Phylum Angiospermophyta (Kingdom Plantae):

1. PErk: Flowering plant Erk/MAP kinase homologs (7 distinct homologs identified in Arabidopsis)

CMGC-III. Glycogen synthase kinase 3 (GSK3) family



1. GSK3α:

2. GSK3β:
Drosophila melanogaster:

1. Sgg:
Saccharomyces cerevisiae:
1. Mck1:

* 2. ScGSK3:

* 3. Mds1:



Glycogen synthase kinase 3, α-form
Glycogen synthase kinase 3, β-form



Glycogen synthase kinase 3 homolog 
1. CK2α:
1. CK2α':
Drosophila melanogaster:

1. DmCK2:
Caenorhabditis elegans:

1. CeCK2:
Theileria parva (a protozoan parasite):
1. TpCK2:



Casein kinase II, alpha subunit 



Casein kinase II a-subunit homolog 



1. DdCK2:
Saccharomyces cerevisiae:

1. ScCK2α:

2. ScCK2α':
Schizosaccharomyces pombe:
• 1. SpCka1:



CMGC-IV. Clk family




Casein kinase II, α-subunit homolog (Orb5)
Flowering plant casein kinase II, α-subunit homolog



Kinase encoded by "Darkener of Apricot" locus 
Suppressor of HAS m 



inkina 



1. Dskl: 
• 2. Prp4: 
Other CMGC Group kinases 



Disln 



1. Mak: 

2. Ched: 

3. PITSLRE: 

4. KKIALRE: 

• 5. PITALRE: 

• 6. PISSLRE:
Saccharomyces cerevisiae:

1. Sme1:
2. Sgv1:
3. Ctk1:

Phylum Angiospermophyta (Kingdom Plantae):

• 1. Mhk: 



Protein-Tyrosine Kinase Group (I-X: \ 
PTK-I. Src family 



Pre-mRNA processing gene product; lacks subdomains X XI
"Male germ cell-associated kinase"



protein 
kinase 



1. Src: 

2. Yes: 

3. Yrk: 

4. Fyn: 

5. Fgr 

6. Lyn: 

7. Hck: 

8. Lck: 



Product of gene essential for start of meiosis
Kinase required for G-protein-mediated adaptr 
Product of gene required for normal growth 

iat): 

Arabidopsis thaliana "Mak homologous kinase" 
Non-membrane-spanning: XI-XXIII: Membrane-spanning) 

Cellular homolog of Rous sarcoma vi 

Cellular homolog of

Yes-related kinase
Protein related to Fgr and Yes
Cellular homolog of Gardner-Rasheed sa
Protein related to Fgr and Yes
Hematopoietic cell protein-tyrosine kinase
Lymphoid T-cell protein-tyrosine kinase
Lymphoid B-cell protein-tyrosine kinase 
Fyn-related kinase 
STK-related kinase 

"Fyn and Yes-related kinase" from electri 
Src homolog, polytene locus 64B 



Hydra vulgaris (Phylum Cnidaria): 
1. Stk:

Spongilla lacustris (Phylum Porifera):
1. SrkM:
PTK-II. Brk family
vertebrate:
1. Brk:



Src-related protein 

Four distinct Src-related kinases 
* 4. Txk:
Drosophila melanogaster:
1. DmTec:
PTK-IV. Csk family



l.Csk: 
• 2. MatK: 
PTK-V. Fes(Fps) family



"Expressed mainly in T cells" kinase (Itk, Tsk)
"Bruton's agammaglobulinemia tyrosine kinase"
Tec-related protein-tyrosine kinase 

Tec homolog, polytene locus 28C



"C terminal Src Kinase"; negative regulator of Src
"Megakaryocyte-associated Tyr-kinase" (Hyl, Lsk, Ctl



Cellular homolog of feline and avian sarcoma viruses 
"Fes/Fps-related" kinase 



Drosophila melanogaster:

1. DmAbl:
Caenorhabditis elegans:
1. CeAbl:
PTK-VII. Syk/Zap70 family



1. Syk: 

2. Zap70: 

Hydra vulgaris (Phylum Cnidaria):
* 1. Htk16:
PTK-VIII. Jak family



1. Tyk2:

2. Jak1:
3. Jak2:

* 4. Jak3:
Drosophila melanogaster:

* 1. Hop:
PTK-IX. Ack



Syk/Zap70-related 



Transducer of interferon α/β si
"Janus kinase", type 1 
"Janus kinase", type 2 
"Janus kinase", type 3 



Product of hopscotch gene required for establ 



l.Fak: 

PTK-XI. Epidermal growth factor receptor family



Epidermal growth factor receptor 

Cell homolog of oncogene activated in ENU-induced rat neuroblastoma (Neu, HER2)
Receptor tyrosine kinase related to EGFR (HER3)
Receptor tyrosine kinase related to EGFR (Tyro2) 



ni (Phylum Platyhelminthes):



Product of gene required for m 



PTK-XII. Eph/Elk/Eck receptor family
vertebrate: 



1. Eph: 

2. Eck: 

3. Eek:
4. Hek:
5. Sek:

6. Elk:

7. Hek2:
8. Htk:

9. Cek5/Nuk: 

10. Ehkl: 

11. Ehk2: 

12. Mykl: 



EGF receptor homolog



Kinase detected in "erythropoietin-producing hepatoma"



Eph/ Elk-related protein-tyrosine kinase 
Eph/Elk related protein-tyrosine kinase (Cek4)



"Eph-like kinase" detected in brain
"Human embryo kinase" type 2 (Cek10)
"Hepatoma transmembrane kinase"
"Chicken embryo kinase 5"/"Neural kinase"
"Eph homology kinase-1" (Cek7)

1 lnase-2" 

d tyrosine kinase, type 1* 
16. Rtk1:

17. Rtk2: 

18. Rtk3: 
PTK-XIII. Axl family



"Mammary-derived tyrosine kinase, type 2"
"Chicken embryo kinase 9"
"Pagliaccio" Xenopus protein expression in neural c
" 1 "to Eph/Elk-related protein-tyrosine kinase 



1. Axl: 

2. Eyk: 

* 3. Brt/Sky/Tif/Rse:
PTK-XIV. Tie/Tek family



"Anexelekto" (Gr. "uncontrolled") tyrosine kinase (UFO, Ark)



"Tyrosine kinase with Ig 



and EGF homology"
Jial cell kinase" (TIE2)



PTK-XV. Platelet-derived growth factor receptor family 
A. Subfamily with 5 Ig-like extracellular domains



1. PDGFRα:

2. PDGFRβ:

3. CSF1R: 

4. Kit: 

5. Flk2: 



1. Flt1:

2. Flt4: 

3. Flkl: 



Platelet-derived growth factor receptor, type alpha 
Platelet-derived growth factor receptor, type beta 
Colony-stimulating factor- 1 receptor (c-Fms) 
Steel growth factor receptor
"Fetal liver kinase-2" (Flt3)



"Fms-like tyrosine kinase", type 1
"Fms-like tyrosine kinase", type 4 
"Fetal liver kinase-1" (KDR) 



PTK-XVI. Fibroblast growth factor receptor family



1. FGFR1: 

2. FGFR2: 

3. FGFR3:

4. FGFR4: 



PTK-XVII. Insulin receptor family 



Fibroblast growth factor receptor, type 1 (Flg, Cek1)
Fibroblast growth factor receptor, type 2 (Bek, K-SAM, Cek3)
Fibroblast growth factor receptor, type 3 
Fibroblast growth factor receptor, type 4 

Fibroblast growth factor receptor homolog, type 1 
Fibroblast growth factor receptor homolog, type 2 



Homolog of insulin receptor 



PTK-XIX. Ros/Sev family
1. Ros: 



l.Sev: 

PTK-XX. Trk/Ror family

1. Trk: 

2. TrkB: 

3. TrkC: 

4. Ror1:

5. Ror2:

6. TcRTK:



1. Dror:
PTK-XXI. Ddr/Tkt family 

* 1. Ddr 

• 2. Tkt: 



"Anaplastic lymphoma kinase 



Cellular homolog of UR2 avian sarcoma virus oncoprotein 
required for R7 i



"Ror" putative receptor, type 1
"Ror" putative receptor, type 2
Trk-related receptor (electric ray) 
Hepatocyte growth factor receptor (MET)
Cellular homolog of



PTK-XXII. Hepatocyte growth factor receptor family 

tocvte ero' 

>gofS13; 
spteur dOrigine 
* 4. Suu "Stem cell-derived tyr< 

PTK-XXIII. Nematode Kin15/16 family
Caenorhabditis elegans:

PTK expressed during h; 



PTK expressed during hypodermal development 
Other membrane-spanning protein-tyrosine kinases (each with no close relatives)
■note: 

1 . Ret: Normal homolog of oncoprotein activated by recombination 

2. Klg: "Kinase-like gene" product

3. Nyk/Ryk: "Novel tyrosine kinase-related protein" (VIK, Mrk, Nbtk1)

Product of torso gene required for embryonic anterior/posterior determination
Distant relative of the mammalian trk gene 



Other protein kinase families (not falling into major groups)

O-I. Polo family



1. GCTK:



1. Plk: 

2. Snk: "Sen 



3. Sak: Polo-related kinase isolated in screen for genes regulating sialylation 

Protein kinase homolog required for mitosis 



1 . Cdc5: Product of gene required for cell cycle progression 

O-II. MEK/STE7 family
vertebrate: 

1. MEK1: "MAP ERK Kinase", type 1 

2. MEK2: "MAP ERK Kinase", type 2 
Drosophila melanogaster:

1. Dsor1:
Saccharomyces cerevisiae:

Kinase required for haploid-specific gene expression 

2. Pbs2: Kinase required for antibiotic drug resistance 

3. Mkkl: "MAP Kinase Kinase", type 1 (suppresses lysis defect of pkcl mutant) 

4. Mkk2: "MAP Kinase Kinase", type 2 (suppresses lysis defect of pkcl mutant) 



1. Byrl: 

2. Wisl: 

O-III. MEKK/Ste11 family



* 1. MEKK:
Saccharomyces cerevisiae:



2. Bck1: "Bypass of C kinase" ki



Suppressor of cdc phenotype in 



Protein required for cell-type-specific transcription 



1. Byr2: Product of gene required for pheromone signal transduction 

Phylum Angiospermophyta (Kingdom Plantae):

* 1. NPK1: Flowering plant (tobacco) homolog of Bck1

O-IV. Pak/Ste20 family 



* 1. Pak: "p21 (Cdc42/Rac) activated kinase"

Product of gene required for pheromone re
O-V. NimA family
- \mu: 

1. Nekl: 

2. Nek2: 

3. Nek3: 

4. NA2: 

5. Stkl: 



Cell cycle control protein kinase 
Product of gene required for segm 
ie protein kinase related to NimA



1. Kin3:
O-VI. wee1/mik1 family
vertebrate:

1. Wee1Hu:
Saccharomyces cerevisiae:

• 1. Swe1: Wee1 homolog from budding yeast



"Wee" size at division kinase; Cdc2 negative regulator
"Mitosis inhibitory kinase", negative regulator of Cdc2 



O-VII. Family of kinases involved in translational control



"Heme-regulated eukaryotic initiation factor 2α kinase"
"Double-stranded RNA-dependent kinase" (Tik)



1. Gcn2: Protein required for tran

O-VIII. Raf family



1. Raf-1: 

2. A-Raf: 

3. B-Raf: 
Drosophila melanogaster:

1. DmRaf: Raf homolog

1. CeRaf: Raf homolog; product of lin-45 gene required for vulval differe

Phylum Angiospermophyta (Kingdom Plantae):

1. Ctr1: Negative regulator of ethylene response pathway

O-IX. Activin/TGFβ receptor family
A. Subfamily of type I receptors 

1. ActR-I: Type I receptor for activin and TGF-β (Tsk7L, SKR1, ALK-2)

2. TSR-1: Type I receptor for activin and TGF-β (ALK-1)

3. TGFβRI: Type I receptor TGF-β (ALK-5)

4. ActR-IB: Type I receptor for activin (ALK-4)

5. BRK-1: Type I receptor for BMP-2 and BMP-4 (ALK-3)

6. ALK-6: "Activin receptor-like kinase", type 6



1. DmAtr-I: 

* 2. DmSax: 

B. Subfamily of type II receptors 

1 . ActRII: Type II receptor for activin 

2. ActRIIB: Type II receptor for activin 

3. TGFβRII: Type II receptor TGF-β

* 4. C14: Putative receptor kinase expressed in gonads

Drosophila melanogaster:

* 1. DmAtr-II: Type II activin receptor homolog
Caenorhabditis elegans:

* 1. DAF-4: Larva development regulatory protein; BMP receptor

C. Others 



Product of gene required for vulval development 
O-X. Flowering plant putative receptor kinase family 
Phylum Angiospermophyta (Kingdom Plantae): 

1. ZmPKl : Putative receptor protein-serine kinase (maize) 

2. Srk: "S receptor kinase"; three distinct alleles: 2, 6, and 910 (Brassica)



3. Tmkl: Putative "Transmembrane receptor kinase" (Arabidopsis) 

4. Apkl : Kinase that phosphorylates Tyr, Ser, and Thr (Arabidopsis 

* 5. Nak: "Novel Arabidopsis Kinase" (Arabidopsis) 
6. Pro25: Putative kinase selected for specificity to thylakoid membrane protein (Arabidopsis) 

* 7. Pto: Product of genen conferring pathogen resistance (tomato) 

* 8. Tmkl 1 : Transmembrane protein with unusual kinase-like domain (/ 

* 9. Prkl: Pollen-expressed receptor-like putative kinase (Petunia) 
O-XI. Family of "mixed-lineage" kinases with leucine zipper domain 
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O-m Casein kinase I ramily 



1. CKIα:

2. CKIβ:
3. CKIγ:
4. CKIδ:



Saccharomyces cerevisiae:
1. Yck1:



Casein kinase I, type alpha 



Budding yeast casein kinase I homolog, type 1 
Budding yeast casein kinase I homolog, type 2 
Kinase required for DNA repair 



Fission yeast casein kinase I homolog, type 1 
Fission yeast casein kinase I homolog, type 2 



to eukaryotic kin



1. Hhp1:
* 2. Hhp2:
O-XIII. PKN family of prokaryotic protein kinases

Myxococcus xanthus (Phylum Myxobacteria: Kingdom Prokaryotae):

1. Pkn1: Protein kinase homologous to

2. Pkn2: Protein kinase required for maintenance of stationary phase cells and development
Other protein kinase family members (each with no known close relatives) 



Kinase expressed in germinal
STE20-related kinase
LIM motif-containing ki




Product of gene essential for photoreceptor function 
Product of gene required for dorsalventral polarity 
Product of gene required for rotation of photoreceptor cl 



1. PhyCen 
Saccharomyces cerevisiae:

2. CDC15:

3. Vps15:

4. Npr1:

5. Elm1:
6. Ire1:

7. YM516:

8. Ipl1:



re proteuvty 



te kinase encoded by a phytochrome gene



"Cell-division-cycle" control gene product
"Cell-division-cycle" control gene product
Product of gene essential for sorting to lysosome-like vacuole
Product of gene required for activity of ammonia-sensitive amir
Product of gene required for yeast-like cell morphology
Required for myo-inositol synthesis and signaling from ER to th
Putative protein kinase gene on chromosome XI 
Product of gene required for chromosome segregation 



a, KripbmPntt 



1. Ran1:

2. Chk1:

* 3. Csk1:

* 4. RPK1:

Entamoeba histolytica (Phylum Rhizopoda, K

1. Ehmfk1: Distant relative of Mos

Phylum Angiospermophyta (Kingdom Plantae):

1. GmPK6: Protein kinase homolog (soybean)

* 2. Tsl: Product of Tousled gene required for normal leaf/flower development (Arabidopsis)
Yersinia pseudotuberculosis (Phylum Omnibacteria, Kingdom Prokaryotae):

1. YpkA: Enterobacterial protein kinase essential for virulence 



known primary structures. The kinase domains are further
divided into 12 smaller subdomains (indicated by Roman
numerals), defined as regions never interrupted by large
amino acid insertions and containing characteristic patterns
of conserved residues (consensus line in Fig. 1).

Twelve kinase domain residues are recognized as being
invariant or nearly invariant throughout the superfamily
(conserved in over 95% of 370 sequences), and hence are
strongly implicated as playing essential roles in enzyme
function. Using the type α cAMP-dependent protein kinase
catalytic subunit (PKA-Cα) as a reference point, these are
equivalent to Gly50 and Gly52 in subdomain I, Lys72 in
subdomain II, Glu91 in subdomain III, Asp166 and Asn171 in
subdomain VIB, Asp184 and Gly186 in subdomain VII, Glu208 in
subdomain VIII, Asp220 and Gly225 in subdomain IX, and Arg280
in subdomain XI.
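
The residue list above lends itself to a small lookup table. The following is only a sketch in Python: the names INVARIANT_RESIDUES and subdomain_of are hypothetical, introduced here for illustration, and the numbering is that of PKA-Cα as given in the text.

```python
# Invariant or nearly invariant kinase-domain residues, numbered as in
# PKA-C-alpha and grouped by subdomain exactly as listed in the text.
# The dictionary name and helper below are illustrative, not from the review.
INVARIANT_RESIDUES = {
    "I": [("Gly", 50), ("Gly", 52)],
    "II": [("Lys", 72)],
    "III": [("Glu", 91)],
    "VIB": [("Asp", 166), ("Asn", 171)],
    "VII": [("Asp", 184), ("Gly", 186)],
    "VIII": [("Glu", 208)],
    "IX": [("Asp", 220), ("Gly", 225)],
    "XI": [("Arg", 280)],
}


def subdomain_of(position):
    """Return the subdomain holding an invariant residue, or None."""
    for subdomain, residues in INVARIANT_RESIDUES.items():
        if any(pos == position for _, pos in residues):
            return subdomain
    return None


# Total number of residues in the table (the text counts twelve).
TOTAL_INVARIANT = sum(len(v) for v in INVARIANT_RESIDUES.values())
```

A call such as subdomain_of(72) looks up the subdomain of the invariant Lys; positions that are not in the list return None.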

The patterns of amino acid residues found within subdomains
VIB, VIII, and IX have been particularly well conserved among
the individual members of the different protein kinase
families, and these motifs have been targeted most frequently
in PCR-based homology cloning strategies aimed at identifying
new family members.

Figure 1. Multiple alignments of 60 kinase domains
representative of members of the eukaryotic protein kinase
superfamily. The abbreviated names used are as defined in
Table 1. The single-letter amino acid code is used and gaps
are indicated by dashes. The entire sequences for the larger
inserts are not shown, but excluded residues are indicated as
numbers in brackets. Twelve distinct subdomains are indicated
by Roman numerals. The consensus line is given according to
the following code: uppercase letters, invariant residues;
lowercase letters, nearly invariant residues; o, positions
conserving nonpolar residues; *, positions conserving polar
residues; +, positions conserving small residues with near
neutral polarity. Residues corresponding to the numbered
β-strands (b) and α-helices (a) in PKA-Cα are indicated in the
2° structure line.

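
The consensus-line code described in the legend to Fig. 1 can be sketched as a function applied to one alignment column. This is a hedged illustration only: the residue classes (NONPOLAR, POLAR, SMALL_NEUTRAL), the decision to ignore gap characters, and the use of 95% as the "nearly invariant" cutoff for a single column are assumptions made for the example, not definitions taken from the figure.

```python
from collections import Counter

# Assumed residue classes; the review does not enumerate them.
NONPOLAR = set("MLIVCFYW")
POLAR = set("DENQKRH")
SMALL_NEUTRAL = set("AGSTP")


def consensus_symbol(column, near_invariant=0.95):
    """Return the consensus-line symbol for one alignment column.

    `column` is a string with one residue (or '-') per aligned sequence.
    Gaps are ignored (an assumption of this sketch).
    """
    residues = [r for r in column.upper() if r != "-"]
    if not residues:
        return " "
    top, count = Counter(residues).most_common(1)[0]
    fraction = count / len(residues)
    if fraction == 1.0:
        return top                 # uppercase: invariant
    if fraction >= near_invariant:
        return top.lower()         # lowercase: nearly invariant
    present = set(residues)
    if present <= NONPOLAR:
        return "o"                 # nonpolar residues conserved
    if present <= POLAR:
        return "*"                 # polar residues conserved
    if present <= SMALL_NEUTRAL:
        return "+"                 # small, near-neutral residues conserved
    return " "                     # no conservation recorded
```

Applying the function column by column to an alignment would give a line in the style of the consensus line of Fig. 1, subject to the assumed residue classes.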
Relationship between conserved subdomains, higher 
order structure, and catalytic mechanism 
The homologous nature of the kinase domains implies that they
all fold into topologically similar 3-dimensional core
structures and impart phosphotransfer according to a common
mechanism. The larger inserts found within some kinase domains
are likely to represent surface elements that do not disrupt
the basic core structure. With the solution of the crystal
structure of mouse PKA-Cα, in a binary complex with a
pseudosubstrate peptide inhibitor (PKI 5-24;
TTYADFIASGRTGRRNAIHD, the underlined Ala substituting for the
Ser phosphoacceptor), the general topology of a protein kinase
catalytic core structure was revealed for the first time (25,
26). Later, structures of ternary complexes of PKA-Cα, the
pseudosubstrate inhibitor, and either MgATP or MnAMP-PNP (an
MgATP analog) were solved (27, 28). As a consequence of these
studies, precise functional roles for most of the highly
conserved kinase domain residues have now been assigned.

The kinase domain of PKA-Cα folds into a two-lobed structure
(Fig. 2). The smaller, NH2-terminal lobe, which includes
subdomains I-IV, is primarily involved in anchoring and
orienting the nucleotide. This lobe has a predominantly
antiparallel β-sheet structure that is unique among nucleotide
binding proteins. The larger COOH-terminal lobe, which
includes subdomains VIA-XI, is largely responsible for binding
the peptide substrate and initiating phosphotransfer. It is
predominantly α-helical in content. Subdomain V residues span
the two lobes. The deep cleft between the two lobes is
recognized as the site of catalysis. The crystal structures of
four additional eukaryotic protein kinase superfamily members,
cyclin-dependent kinase 2 (Cdk2) (29), p42 MAP kinase (Erk2)
(30), twitchin kinase (31), and casein kinase I (32), have
been reported more recently, and as expected, their kinase
domains were found to fold into two-lobed structures
topologically very similar to the catalytic core of PKA-Cα.
Notable differences, however, were found in the regions
corresponding to subdomain VIII in the Cdk2 and Erk2
structures, apparently reflecting the fact that these are
structures of enzymes in an inactive state (see below). The
twitchin structure is also of an inactive enzyme, but in this
case it is inactive due to the presence of an autoinhibitory
peptide sequence, which lies on the COOH-terminal side of the
kinase domain and folds back into the active site cleft
between the two lobes (31). This peptide apparently forces the
two lobes to rotate almost 30° with respect to one another,
and in this configuration inactive twitchin is more similar to
the open configuration of PKA-Cα without PKI (33). In both
twitchin and Cdk2 the α-helix C in subdomain III also adopts a
different position from that of helix C in PKA-Cα.
Unfortunately, no structure of a protein-tyrosine kinase
catalytic domain was available at the time of writing (see
"Note added in proof"), but the ease with which it has been
possible to model the kinase domain of the EGF receptor
protein-tyrosine kinase onto that of PKA-Cα emphasizes that
the structure of the protein-tyrosine kinases will be similar
to that of the protein-serine kinases (34).

The conserved kinase subdomains correspond quite well to
precise units of higher order structure. The functions of the
individual subdomains are discussed briefly below on a
subdomain-by-subdomain basis, making reference to the crystal
structure of PKA-Cα and drawing attention to the proposed
roles of the nearly invariant amino acid residues (25-28) and
other residues of interest. For more detailed information,
the reader is referred to recent reviews on the structure of
PKA-Cα (35-37) and to an excellent comparative review of the
structures of PKA-Cα, Erk2, and Cdk2 (38).

Subdomain I, at the NH2 terminus of the kinase domain,
contains the consensus motif Gly-x-Gly-x-x-Gly-x-Val (starting
with Gly50 in PKA-Cα). The kinase domain NH2-terminal boundary
occurs seven positions upstream of the first glycine in the
consensus, where a hydrophobic residue is usually found.
Subdomain I residues fold into a β-strand-turn-β-strand
structure encompassing β-strands 1 and 2, and this structure
acts as a flexible flap or clamp that covers and anchors the
nontransferable phosphates of ATP. The backbone amides of
Ser53, Phe54, and Gly55 form hydrogen bonds with ATP
β-phosphate oxygens. Leu49 and Val57 contribute to a
hydrophobic pocket that encloses the adenine ring of ATP.
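
As a rough illustration of how the subdomain I consensus and the NH2-terminal boundary rule might be applied to a sequence, the sketch below searches for Gly-x-Gly-x-x-Gly-x-Val with a regular expression and then steps seven positions upstream of the first glycine. The function name and the example fragment are assumptions for illustration; the fragment was chosen to agree with the residues named above (Leu49, Gly50, Gly52, Ser53, Phe54, Gly55, Val57), and the remaining letters should be treated as placeholders.

```python
import re

# Gly-x-Gly-x-x-Gly-x-Val in single-letter code ('.' matches any residue).
GLYCINE_LOOP = re.compile(r"G.G..G.V")


def find_glycine_loop(seq):
    """Return (index of first Gly, estimated NH2-terminal boundary index).

    Indices are 0-based positions in `seq`. The boundary is taken seven
    positions upstream of the first glycine, as described in the text,
    and is clipped to 0 if the sequence is too short. Returns None when
    the motif is absent.
    """
    match = GLYCINE_LOOP.search(seq)
    if match is None:
        return None
    first_gly = match.start()
    boundary = max(first_gly - 7, 0)
    return first_gly, boundary


# Illustrative fragment only; see the caveat in the lead-in.
EXAMPLE_FRAGMENT = "FERIKTLGTGSFGRVMLV"
```

Many kinases deviate from the strict consensus, so a practical search would need a more permissive pattern; this sketch only encodes the motif as written.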



Subdomain II contains the invariant Lys (Lys72 in PKA-Cα),
which has long been recognized as being essential for maximal
enzyme activity. This Lys lies within β-strand 3 of the small
lobe, and helps anchor and orient ATP by interacting with the
α- and β-phosphates. In addition, Lys72 forms a salt bridge
with the carboxyl group of the nearly invariant Glu91 in
subdomain III. Ala70 contributes to the hydrophobic adenine
ring pocket. In PKA-Cα, β-strand 3 is followed immediately by
α-helix B, which, judging from the sequence alignment, appears
to be quite a variable structure among the protein kinases.
Indeed, this α-helix is absent in the Cdk2 and Erk2 crystal
structures.

Subdomain III represents the large α-helix C in the small
lobe. The nearly invariant Glu residue (Glu91 in PKA-Cα) is
centrally located in this helix and helps stabilize the
interactions between Lys72 and the α- and β-phosphates of ATP.
Subdomain IV corresponds to the hydrophobic β-strand 4 in the
small lobe. This subdomain contains no invariant or nearly
invariant residues and does not appear to be directly involved
in catalysis or substrate recognition.

Figure 2. Ribbon diagram of the catalytic core of PKA-Cα
(residues 40-300) in a ternary complex with MgATP and
pseudosubstrate peptide inhibitor (PKI 5-24). Invariant or
nearly invariant residues (Gly50, Gly52, Gly55, Lys72, Glu91,
Asp166, Asn171, Asp184, Glu208, Asp220, and Arg280) are
indicated by dots along the ribbon diagram. Side chains are
shown for Lys72, Asp166, Asn171, Asp184, Glu208, and Arg280.
β-strands and α-helices are indicated by flat arrows and
helices, respectively, and are numbered according to Knighton
et al. (26). The small arrow indicates the site of
phosphotransfer with the Ala in PKI substituting for the
phosphoacceptor Ser in the true substrate. (Reproduced, with
permission, from Taylor et al. (36)).

Subdomain V links the small and large lobes of the catalytic
subunit and consists of the very hydrophobic β-strand 5 in the
small lobe, the small α-helix D in the large lobe, and an
extended chain that connects them. Three residues in the
connecting chain of PKA-Cα, Glu121, Val123, and Glu127, help
anchor ATP by forming hydrogen bonds with either the adenine
or the ribose ring. Met120, Tyr122, and Val123 contribute to
the hydrophobic pocket surrounding the adenine ring. Glu127
also participates in peptide binding by forming an ion pair
with an Arg in the pseudosubstrate site of the PKA inhibitor
peptide. This represents the first Arg in the PKA substrate
recognition consensus Arg-Arg-x-Ser*-Hydrophobic.
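
The PKA recognition consensus Arg-Arg-x-Ser*-Hydrophobic can be written as a simple pattern. The sketch below is illustrative only: the set of residues counted as hydrophobic and the function name are assumptions. It uses the PKI 5-24 sequence quoted earlier to show why the inhibitor behaves as a pseudosubstrate: the pattern finds no site in PKI itself, because Ala occupies the acceptor position, but does find one when that Ala is changed to Ser.

```python
import re

# Residues treated as "Hydrophobic" at position +1 (an assumption).
HYDROPHOBIC = "AILMFVWY"

# Arg-Arg-x-Ser-Hydrophobic; the Ser is captured as the acceptor.
PKA_SITE = re.compile(rf"RR.(S)[{HYDROPHOBIC}]")


def find_pka_sites(seq):
    """Return 0-based indices of Ser residues matching the consensus."""
    return [m.start(1) for m in PKA_SITE.finditer(seq)]


PKI_5_24 = "TTYADFIASGRTGRRNAIHD"  # pseudosubstrate, Ala at the acceptor site

# Hypothetical substrate version: the acceptor-position Ala replaced by Ser.
PKI_ALA_TO_SER = PKI_5_24[:16] + "S" + PKI_5_24[17:]
```

The pattern is deliberately strict and non-overlapping; real PKA substrates tolerate deviations from the consensus that this sketch would miss.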

Subdomain VIA folds into the large hydrophobic α-helix E that
extends through the large lobe. None of the residues in helix
E appear to interact directly with either MgATP or peptide
substrate; hence this part of the molecule appears to act
mainly as a support structure. Subdomain VIB folds into the
small hydrophobic β-strands 6 and 7 with an intervening loop.
Included here are two invariant residues (Asp166 and Asn171 in
PKA-Cα) that lie within the consensus motif
His-Arg-Asp-Leu-Lys-x-x-Asn (HRDLKxxN). The loop has been
termed the catalytic loop because Asp166 within the loop has
emerged as the likely candidate for the catalytic base,
accepting the proton from the attacking substrate hydroxyl
group during an in-line phosphotransfer mechanism. Lys168 in
the loop (substituted by Arg in the conventional
protein-tyrosine kinases) may help facilitate phosphotransfer
by neutralizing the negative charge of the γ-phosphate during
transfer. The side chain of Asn171 helps to stabilize the
catalytic loop through hydrogen bonding to the backbone
carbonyl of Asp166 and also acts to chelate the secondary Mg2+
ion that bridges the α- and γ-phosphates of the ATP. The
carbonyl group of Glu170 forms a hydrogen bond with an ATP
ribose hydroxyl group. Glu170 also participates in substrate
binding by forming an ion pair with the second arginine of the
peptide recognition consensus.
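
A search for the catalytic loop motif might look like the following sketch. Two departures from the literal HRDLKxxN consensus are assumptions of this example: Tyr is accepted at the first position (the text elsewhere places Tyr164 and Arg165 immediately before Asp166 in PKA-Cα), and Arg is accepted at the Lys168 position, as the text describes for conventional protein-tyrosine kinases. The function name and the two test sequences are hypothetical.

```python
import re

# [His/Tyr]-Arg-Asp-Leu-[Lys/Arg]-x-x-Asn; the Lys/Arg position is captured.
CATALYTIC_LOOP = re.compile(r"[HY]RDL([KR])..N")


def find_catalytic_loop(seq):
    """Locate the catalytic loop motif in `seq`.

    Returns a dict with 0-based indices of the catalytic Asp and the
    conserved Asn (five positions downstream, as Asp166 and Asn171 are in
    PKA-C-alpha) and the residue found at the Lys168-equivalent position.
    Returns None if the motif is not found.
    """
    match = CATALYTIC_LOOP.search(seq)
    if match is None:
        return None
    asp = match.start() + 2
    return {"asp": asp, "asn": asp + 5, "residue_at_168": match.group(1)}


# Toy sequences written for this example, not real kinase sequences.
SER_KINASE_LIKE = "AAYRDLKPENAA"
TYR_KINASE_LIKE = "GGHRDLRAANGG"
```

The residue reported at the Lys168-equivalent position gives a crude hint of Ser/Thr versus Tyr specificity, but it should not be read as a reliable classifier.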

Subdomain VII folds into a β-strand-loop-β-strand structure,
encompassing β-strands 8 and 9. The highly conserved DFG
triplet, corresponding to Asp184-Phe185-Gly186 in PKA-Cα, lies
in the loop that is stabilized by a hydrogen bond between
Asp184 and Gly186. Asp184 chelates the primary activating Mg2+
ions that bridge the β- and γ-phosphates of the ATP, and
thereby helps to orient the γ-phosphate for transfer. In Cdk2,
β-strand 9 is replaced with a small α-helix designated αL12.
However, it is unclear whether this helical character is
maintained when Cdk2 is in its active conformation.

Subdomain VIII, which includes the highly conserved
Ala-Pro-Glu ('APE') motif (residues 206-208 in PKA-Cα), folds
into a tortuous chain that faces the cleft. Residues lying
7-10 positions immediately upstream of the APE motif are
characteristically well conserved among the members of
different protein kinase families. The nearly invariant Glu
corresponding to PKA-Cα Glu208 forms an ion pair with an
invariant Arg (Arg280 in PKA-Cα) in subdomain XI, thereby
helping to stabilize the large lobe.

Subdomain VIII appears to play a major role in recognition of
peptide substrates. Several PKA-Cα subdomain VIII residues
participate in binding the pseudosubstrate inhibitor peptide.
Leu198, Cys199, Pro202, and Leu205 of PKA-Cα provide a
hydrophobic pocket that accommodates the side chain of the
hydrophobic residue at position +1 of the substrate consensus
(Ile for the inhibitor peptide). Gly200 forms a hydrogen bond
with the same Ile residue. Glu203 forms two ion pairs with the
Arg in the high-affinity binding region of the inhibitor
peptide.

Many protein kinases are known to be activated by
phosphorylation of residues in subdomain VIII. In PKA-Cα,
maximal kinase activity requires phosphorylation of Thr197,
probably occurring through an intermolecular
autophosphorylation mechanism (39). In the crystal structure,
phosphate oxygens of phospho-Thr197 form hydrogen bonds with
the charged side chains of Arg165 and Lys189 and with the
hydroxyl group of Thr195, and thereby may act to stabilize the
subdomain VIII loop in an active conformation permitting
proper orientation of the substrate peptide. For members of
the Erk (MAP) kinase family, phosphorylation of both a Thr and
a Tyr residue in subdomain VIII (mediated by members of the
MEK kinase family) is required for activation. In the crystal
structure determined for Erk2, these residues (Thr183 and
Tyr185) were not phosphorylated and thus the enzyme was in an
inactive state (unlike the PKA-Cα structure). The
unphosphorylated Tyr185 is buried in a hydrophobic pocket, and
interactions with Tyr185 are apparently required to hold the
enzyme in the inactive state. Mutation of Tyr185, however,
does not activate the enzyme, and so phosphorylation of Tyr185
must also play a role in activation. Unphosphorylated Erk2
appears to be inactive because residues required for catalysis
are not properly oriented, and because its conformation
results in a partial steric block to substrate binding. During
activation of Erk2, Tyr185 phosphorylation precedes Thr183
phosphorylation; therefore, binding of MEK to Erk2 may alter
the conformation of the subdomain VIII loop, thereby exposing
Tyr185 for phosphorylation by MEK. Interaction of
phospho-Tyr185 with surface residues would then allow the
subdomain VIII loop to adopt the active conformation (30).
Subsequent phosphorylation of the exposed Thr183 may activate
the enzyme fully by promoting correct alignment of the
catalytic residues. From the crystal structure of Cdk2,
likewise in an inactive unphosphorylated state, the subdomain
VIII loop appears to be in a conformation that would inhibit
enzyme activity by sterically blocking the presumed protein
substrate binding cleft (29). Phosphorylation of Thr160 in the
Cdk2 subdomain VIII, mediated by MO15 (CAK), presumably would
act to remove this inhibition by stabilizing the loop in an
active conformation similar to that found in PKA-Cα. Cyclin
binding to the NH2-terminal lobe is also needed to activate
Cdk2, and this may cause rotation of the NH2-terminal domain
resulting in correct alignment of catalytic residues.

Subdomain IX corresponds to the large α-helix F of the large
lobe. The nearly invariant Asp corresponding to PKA-Cα Asp220
lies in the NH2-terminal region of this helix and acts to
stabilize the catalytic loop by hydrogen bonding to the
backbone amides of Arg165 and Tyr164 that precede the loop.
Glu230 of PKA-Cα forms an ion pair with the second Arg of the
peptide recognition consensus. PKA-Cα residues 235-239 are all
involved in hydrophobic interactions with the inhibitor
peptide.

Subdomain X is the most poorly conserved subdomain and its
function is obscure. In the crystal structure of PKA-Cα, it
corresponds to the small α-helix G that occupies the base of
the large lobe. Members of the Cdk, Erk (MAP), GSK3, and Clk
kinase families (the CMGC group) all have rather large
insertions between subdomains X and XI, whose functional
significance is presently unclear. Subdomain XI extends to the
COOH-terminal end of the kinase domain. The most notable
feature here is the nearly invariant Arg corresponding to
Arg280 in PKA-Cα, which lies between α-helices H and I. The
COOH-terminal boundary of the kinase domain is still poorly
defined. For many protein-serine kinases, the consensus motif
His-x-Aromatic-Hydrophobic is found beginning 9-13 residues
downstream of the invariant Arg. For protein-tyrosine kinases,
a hydrophobic amino acid lying 10 positions downstream of the
invariant Arg appears to define the COOH-terminal boundary.
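
The two COOH-terminal boundary rules can be expressed as a pair of small helper functions. This is a sketch under stated assumptions: the position of the invariant Arg is supplied by the caller, "9-13 residues downstream" is read as an offset of 9 to 13 from the Arg index, the aromatic and hydrophobic residue sets are chosen for the example, and the function names and test sequences are hypothetical.

```python
AROMATIC = set("FWY")
HYDROPHOBIC = set("AILMFVWY")


def ser_kinase_c_motif(seq, arg_index):
    """Find His-x-Aromatic-Hydrophobic starting 9-13 residues after the Arg.

    Returns the 0-based index at which the motif begins, or None.
    """
    for offset in range(9, 14):
        start = arg_index + offset
        window = seq[start:start + 4]
        if (len(window) == 4 and window[0] == "H"
                and window[2] in AROMATIC and window[3] in HYDROPHOBIC):
            return start
    return None


def tyr_kinase_c_boundary(seq, arg_index):
    """Return the index 10 positions after the Arg if it is hydrophobic."""
    index = arg_index + 10
    if index < len(seq) and seq[index] in HYDROPHOBIC:
        return index
    return None


# Toy sequences: an Arg followed by filler residues and the expected signal.
TOY_SER_KINASE = "R" + "G" * 10 + "HPWF" + "GG"
TOY_TYR_KINASE = "R" + "G" * 9 + "L" + "GG"
```

Because the text itself describes the boundary as poorly defined, results from rules like these are best treated as approximate.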

The amphipathic α-helix A of PKA-Cα (residues 15-35; not shown
in Fig. 2), though lying outside of the conserved catalytic
core on the NH2-terminal side, appears to be an important
feature found in many protein kinases (40). This helix spans
the surface of both lobes of the core structure and
complements and stabilizes the hydrophobic cleft between the
two lobes. The A-helix motif appears to be present in many
other protein kinases including members of the protein kinase
C family and the Src family of protein-tyrosine kinases (40).

CLASSIFICATION OF EUKARYOTIC PROTEIN 
KINASES 

To facilitate analysis and management of this large super- 
family we have devised the classification scheme shown in 
Table 1, which subdivides the known members of the 
eukaryotic protein kinase superfamily into distinct fami- 
lies that share basic structural and functional properties. 
Phylogenetic trees derived from an alignment of kinase 
domain amino acid sequences (essentially an expanded 
version of Fig. 1) served as the basis for this classification. 
Thus, the sole consideration was similarity in kinase do- 
main amino acid sequence. When considered alone, how- 
ever, this property has been a good indicator of other 
characteristics held in common by the different members 
of the family. 

Protein kinases whose entire kinase domain amino acid sequence
had been published by July 1993 were included in the
phylogenetic analysis (as well as a few others made available
at that time through sequence databases). If a given kinase
domain sequence had been determined from more than one species
among the vertebrates (i.e., orthologous gene products), only
one representative (usually human) was included in the
analysis. This policy was not used for the other phyla,
however, because of greater divergences between the species
and, hence, the sequences. The kinase domain phylogenies were
inferred using the principle of maximum parsimony according to
the PAUP software package developed by Swofford (41).
Minimum-length trees were found using PAUP's 'heuristic'
search method with branch swapping by the 'tree
bisection-reconnection' strategy. Equal weights were given for
all amino acid substitutions. Because multiple minimum-length
trees were found, a consensus tree was calculated according to
the method of Adams (cited in ref. 41) in order to show
branching ambiguities.
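
To make the idea of a "minimum-length" tree under equal substitution weights concrete, the sketch below scores a fixed tree with Fitch's small-parsimony algorithm, which counts the minimum number of substitutions each alignment column requires. It is not PAUP and does not attempt the heuristic search or tree bisection-reconnection swapping described above; it only shows how the length of one candidate tree could be computed. Trees are written as nested pairs of aligned sequences, and the four two-residue sequences are invented for the example.

```python
def fitch_site(tree, site):
    """Return (possible ancestral states, substitution count) for one column.

    `tree` is either a leaf (an aligned sequence string) or a 2-tuple of
    subtrees. All substitutions cost 1 (equal weights).
    """
    if isinstance(tree, str):
        return {tree[site]}, 0
    left, right = tree
    left_states, left_cost = fitch_site(left, site)
    right_states, right_cost = fitch_site(right, site)
    shared = left_states & right_states
    if shared:
        return shared, left_cost + right_cost
    # No shared state: one substitution is needed on this branch pair.
    return left_states | right_states, left_cost + right_cost + 1


def tree_length(tree, n_sites):
    """Total number of substitutions over all alignment columns."""
    return sum(fitch_site(tree, i)[1] for i in range(n_sites))


# Invented toy alignment of four sequences, two columns each.
a, b, c, d = "KD", "KD", "RD", "RE"
grouped = ((a, b), (c, d))   # similar sequences placed together
mixed = ((a, c), (b, d))     # similar sequences split apart
```

Here tree_length(grouped, 2) is smaller than tree_length(mixed, 2), so a parsimony search would prefer the first topology; a real analysis repeats this comparison over a very large number of candidate trees.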

To accommodate the large numbers of sequences, it was
necessary to construct five separate trees. Initially, a
skeleton tree of 99 kinases was obtained (Fig. 3A). The
skeleton tree included only representative members from each
of four large groups of protein kinases, each consisting of
multiple related families known from previous work to cluster
together in the tree. These four groups are designated: 1) the
AGC group, which includes the cyclic-nucleotide-dependent
family (PKA and PKG), the protein kinase C (PKC) family, the
β-adrenergic receptor kinase (βARK) family, the ribosomal S6
kinase family, and other close relatives; 2) the CaMK group,
which includes the family of protein kinases regulated by
calcium/calmodulin, the Snf1/AMPK family, and other close
relatives; 3) the CMGC group, which includes the family of
cyclin-dependent kinases, the Erk (MAP) kinase family, the
glycogen synthase kinase 3 (GSK3) family, the casein kinase II
family, the Clk (Cdk-like kinase) family, and other close
relatives; and 4) the 'conventional' protein-tyrosine kinase
(PTK) group. Separate trees (Fig. 3B-E) were later obtained
for each of the four large kinase groups, and contain all
members of the groups whose sequences were available at the
time of analysis.

Figure 3. Phylogenetic trees of the eukaryotic protein kinase
superfamily inferred from kinase domain amino acid sequence
alignments. The abbreviated nomenclature is the same as that
used in Table 1. A) 'Skeleton' tree showing 99 protein
kinases. Positions of 4 clusters (AGC, CaMK, CMGC, and PTK)
containing protein kinases representative of larger groups are
indicated in the skeleton tree. B) AGC group tree of 59
protein kinases including PKA, PKG, and PKC and other close
relatives. C) CaMK group tree of 35 protein kinases including
the calcium/calmodulin-regulated enzymes. D) CMGC group tree
of 59 protein kinases including the cyclin-dependent kinases.
E) PTK group tree of 90 conventional protein-tyrosine kinases.
Tree A is unrooted and drawn with Pkn1 and Pkn2 as outgroups.
Outgroups of two or more distantly related protein kinases
(not shown) were included in the analysis of trees B-E to
provide a rooting point. Asterisks (*) in all trees indicate
branches leading to defined protein kinase families listed in
Table 1. Branch lengths indicate number of amino acid
substitutions required to reach hypothetical common ancestors
at internal nodes.



It can be reasonably surmised that the protein kinases 
having closely related catalytic domains, and thus defining 
a family, represent products of genes that have under- 
gone relatively recent evolutionary separations. Given 
this, it should come as no surprise that members of a 
given family tend also to share related functions. This is 
manifest by similarities in overall structural topology, 
mode of regulation, and substrate specificity. The details 
of the common properties exhibited by the members of 
the various kinase families can best be gleaned from 
studying the information outlined in the individual en- 
tries section of the Protein Kinase Factsbook (42). Some of 
the most salient relationships are discussed below. 

The AGC group protein kinases tend to be basic amino
acid-directed enzymes, phosphorylating substrates at Ser/Thr
residues lying very near Arg and Lys. For the cyclic
nucleotide-dependent and ribosomal S6 kinase families, the
preferred substrates have basic residues lying in specific
positions NH2-terminal to the phosphate acceptor. Preferred
substrates for the PKC and RAC families have basic residues on
both the NH2- and COOH-terminal sides of the acceptor (43).
The G-protein-coupled receptor kinases (βARK and RhK) appear
to break this rule, however, as they are reported to prefer
synthetic peptide substrate residues located within an acidic
environment. Little substrate information is available for the
other families in this group.



The CaMK group protein kinases also tend to be basic amino
acid-directed, and in this regard it is notable that the AGC
and CaMK groups fall near one another in the phylogenetic
tree. CaMK1, CaMK2, CaMK4, MLCK, CDPK, and AMPK are all
reported to prefer substrates with basic residues at specific
positions NH2-terminal to the acceptor site, whereas EF2K and
PhK prefer sites with basic residues at both NH2- and
COOH-terminal locations. Many, but not all, of the CaMK group
protein kinases are known to be activated by Ca2+/calmodulin
binding to a small domain located just COOH-terminal to the
catalytic domain, e.g., CaMK1, CaMK2, CaMK4, PhKγ, MLCK, and
twitchin. These enzymes and their close relatives are grouped
together in a large family within the CaMK group. Also
included in this family is a subfamily of plant enzymes
(represented by CDPK) that contain an intrinsic
calmodulin-like domain that confers Ca2+-dependent activation.
The other family within the CaMK group is the Snf1/AMPK
family. Within this family, substrate specificity determinant
information has been obtained only for the AMP-activated
protein kinase, which also shows a requirement for an
NH2-terminal basic residue. The other major category of
protein-serine kinases is the CMGC group. For the most part,
these are proline-directed enzymes, phosphorylating substrates
at sites lying in Pro-rich environments. Available data for
Cdc2 and Cdk2 indicate that members of the cyclin-dependent
kinase family require phosphate acceptors lying immediately
NH2-terminal to a Pro. A similar requirement is indicated for
the Erk (MAP) kinase family. The situation for the GSK3 family
is more complicated, but most known acceptor sites lie within
Pro-rich regions. The structures of Cdk2 and Erk2 indicate
that the pocket for the +1 residue is shallower than in PKA-Cα
due to the replacement of Leu205 by an Arg, which is bulkier
and precludes binding of the larger hydrophobic amino acids.
In addition, the unique secondary amide group of Pro may make
special interactions (44). The casein kinase II family enzymes
fail to conform to the proline-directed specificity exhibited
by the other major families of this group, showing instead a
strong preference for Ser residues located NH2-terminal to a
cluster of acidic residues. The CMGC group protein kinases
have larger-than-average kinase domains due to insertions
between subdomains X and XI, whose functional significance is
presently unclear.

Also, the family of protein kinases involved in translational
control (HRI, PKR/Tik, Gcn2) appear to be basic amino
acid-directed enzymes preferring Ser residues lying
NH2-terminal to an Arg. Finally, as mentioned previously, the
MEK/Ste7 family protein kinases and Wee1/Mik1 protein kinases
exhibit a dual specificity.

Although this classification is based solely on catalytic
domain sequences, members of families defined by this means
are usually closely related in regions lying outside the
catalytic domains and in many cases have been shown to have
related functions. Thus, intercalation of newly discovered
protein kinases into this classification should allow one to
make useful predictions about the functions of such enzymes.

The conventional protein-tyrosine kinase group includes a
large number of enzymes with quite closely related kinase
domains that specifically phosphorylate on Tyr residues (i.e.,
they cannot phosphorylate Ser or Thr). These enzymes, first
recognized among retroviral oncoproteins, have been found only
in metazoan cells where they are widely recognized for their
roles in transducing growth and differentiation signals.
Included in this group are more than a dozen distinct receptor
families made up of membrane-spanning molecules that share
similar overall structural topologies, and nine nonreceptor
families also composed of structurally similar molecules. The
specificity determinants surrounding the Tyr phosphoacceptor
sites have yet to be firmly established for these enzymes, but
Glu residues either on the NH2- or COOH-terminal side of the
acceptor are often preferred. This group is labeled
"conventional" to distinguish it from other protein kinases
(including Spk1, Clk, the MEK/Ste7 family members, Wee1/Mik1,
ActRII, Hrr25, Esk, and SplA/DPyk2) reported to exhibit a dual
specificity, that is, being capable of phosphorylating both
Tyr and Ser/Thr residues (45). However, in most cases dual
specificity has been observed only for autophosphorylation
reactions in vitro, and the only dual-specificity protein
kinases that are known to be able to phosphorylate a substrate
on Ser/Thr and Tyr are members of the MEK family. Considered
as a group, these dual-specificity protein kinases are not
particularly closely related to the conventional PTKs. Indeed,
they seem to map throughout the phylogenetic tree (45),
suggesting that the ability to autophosphorylate on Tyr may
have had many independent origins during the evolutionary
history of the superfamily.

The protein kinases falling outside the four major groups are
a mixed bag. Although the individual members within the
defined families found in this "other" category clearly are
related to one another through both structure and function, it
is difficult to make broader generalizations that could group
any of these families together into a larger category. As far
as substrate specificity determinants go, little is known
about most "other" category protein kinases, due primarily to
their rather recent discovery and the paucity of known
physiological substrates. The casein kinase I family members,
however, have been shown to prefer Ser/Thr residues located
COOH-terminal to a phosphoserine or phosphothreonine, although
a stretch of acidic



FUTURE PROSPECTS 

The rate of protein kinase discovery still shows no signs 
of abating. In addition to the continuing successes of 
homology-based approaches, genomic sequencing pro- 
jects are beginning to make significant contributions. For 
instance, the sequences of two entire budding yeast chro- 
mosomes (46, 47) and a ~2 Mb stretch of C. elegans chro- 
mosome III (48) have revealed a number of new putative 
protein kinase genes. As genome sequencing projects 
gather speed, the number of new protein kinase genes 
discovered in this way will undoubtedly mushroom. This 
explosion of sequence data is making it increasingly diffi- 
cult to manage protein kinase databases of the sort de- 
scribed here. Programs designed to align and derive 
relatedness trees are currently unable to handle the large 
number of available kinase domain sequences. New data 
handling programs will have to be developed to cope with 
large numbers of sequences like those of the eukaryotic 
protein kinase superfamily. 

Protein kinase catalytic domain structures will continue 
to be solved. The first structure of a conventional pro- 
tein-tyrosine kinase will be available shortly (see "Note 
added in proof"), and this should reveal how Tyr is se- 
lected as an acceptor amino acid vs. Ser/Thr. Such struc- 
tures will enable comparative analysis to be carried out at 
the 3-dimensional level, and allow predictions of struc- 
tures from primary sequences. Structural comparisons of 
catalytic domains with bound peptide substrates will also 
provide insights into substrate specificity. Most protein 
kinases show some degree of primary sequence specific- 
ity, and new methods are being developed to determine 
consensus sequence specificities for individual protein ki- 
nases (44). With such consensus information the struc- 
tural basis for the binding of a preferred peptide 
sequence to the cognate substrate binding site can then 
be deduced. In the future, it may be possible to model the 
3-dimensional structure of a novel protein kinase cata- 
lytic domain with sufficient accuracy to be able to deduce 
the preferred primary sequence surrounding the hy- 
droxyamino acid it phosphorylates, which in turn will 
allow one to predict what proteins might be its substrates 
from the increasingly complete database of protein se- 
quences. 
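
The last step of that prospect, searching a protein sequence database for sites that match a kinase's preferred primary sequence, can be illustrated with a minimal sketch. It assumes the consensus is already known and can be written as a simple pattern; the pattern used here (Arg-Arg-X-Ser/Thr, of the kind reported for the cAMP-dependent protein kinase) is only a stand-in, and the function names and example sequences are hypothetical. A real search would need position-weighted scoring and structural context rather than an exact match.

```python
import re

# Illustrative consensus only (assumption): Arg-Arg-X-Ser/Thr.
# The lookahead lets overlapping candidate sites be reported.
CONSENSUS = re.compile(r"(?=(RR.[ST]))")


def find_candidate_sites(sequence, pattern=CONSENSUS):
    """Return (1-based acceptor position, matched motif) for each hit.

    The acceptor residue is taken to be the last residue of the motif.
    """
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        # match.start(1) is 0-based; the last motif residue, counted
        # from 1, therefore sits at start + len(motif).
        acceptor_pos = match.start(1) + len(motif)
        hits.append((acceptor_pos, motif))
    return hits


def rank_candidates(proteins, pattern=CONSENSUS):
    """Scan a name -> sequence mapping; list proteins with hits, most hits first."""
    scored = {name: find_candidate_sites(seq, pattern)
              for name, seq in proteins.items()}
    with_hits = [(name, hits) for name, hits in scored.items() if hits]
    # Sort by number of hits (descending), then by name for a stable order.
    return sorted(with_hits, key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    # Made-up sequences, for demonstration only.
    database = {
        "protA": "MKRRASLGRRQTA",
        "protB": "MAAAA",
        "protC": "GGRRLSGG",
    }
    for name, hits in rank_candidates(database):
        print(name, hits)
```

Exact pattern matching of this kind only shortlists candidates; whether a listed site is actually phosphorylated would still have to be tested experimentally.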



Note added in proof: The crystal structure of the tyrosine kinase 
domain of the insulin receptor has now appeared (Hubbard, 
S. R., Wei, L., Ellis, L., and Hendrickson, W. A. (1994) Nature 372, 
746-754). 